A standardized image processing and data quality platform for rodent fMRI

Functional magnetic resonance imaging in rodents holds great potential for advancing our understanding of brain networks. Unlike the human community, there remains no standardized resource in rodents for image processing, analysis and quality control, posing significant reproducibility limitations. Our software platform, Rodent Automated Bold Improvement of EPI Sequences, is a pipeline designed to address these limitations for preprocessing, quality control, and confound correction, along with best practices for reproducibility and transparency. We demonstrate the robustness of the preprocessing workflow by validating performance across multiple acquisition sites and both mouse and rat data. Building upon a thorough investigation into data quality metrics across acquisition sites, we introduce guidelines for the quality control of network analysis and offer recommendations for addressing issues. Taken together, this software platform will allow the emerging community to adopt reproducible practices and foster progress in translational neuroscience.

nuisance timecourses (e.g. the 6 realignment parameters) are fitted against the fMRI signal at each voxel to obtain a modelled timecourse representing the confounded portion of the fMRI signal.To obtain an average confound timecourse across the brain, we compute the L2-norm across voxels.The proportion of variance explained by confound regression is also provided.These features allow both to visualize confound effects, and evaluate whether confound regression appropriately modelled confounds detected from other temporal features.f) Mean amplitude of network VS confound timecourses: The averaged timecourse between network analyses and confound sources are compared.This is discussed in the main text (figure 2).g) Spatial distribution in signal variability (BOLD SD ): The first spatial feature of the diagnosis is the signal variability (standard deviation) at each voxel.In sup.figure 4, we observe that signal variability is largely homogeneous without the major influence of confounds, but displays instead the anatomical signature of motion (i.e.anatomical edges) in the corrupted scan.This map thus offers an index of whether significant confounds are contributing to the signal.h) Confound regression variance explained (CR SD ): Here, the variance explained from confound regression is quantified at each voxel by taking the standard deviation from the modelled confound timecourse (see e)).This allows to contrast spatially the amplitude of confound effects.CR SD is the only spatial feature revealing the expected signature of motion along anatomical edges for both uncorrupted and corrupted examples (although more prominent in the corrupted scan).This suggests that this feature can both reliably and specifically reflect the presence of confounds, better so than alternative features.i) Confound regression variance explained proportion: Similar to CR SD , but showing instead the proportion of variance explained (R 2 ).j) Global signal covariance: The correlation with the global signal was used previously to identify globals sources of confounds 1 .Here we select instead the covariance to preserve the inhomogeneous distribution of voxel variance, which is relevant for the detection of confounds in g).By visualizing the spatial contrast in the covariance of each voxel with the global signal timecourse, we can clearly observe the mixed nature of the global signal, which is known to capture the effect from both confounds 1 and neural activity 5 .Indeed, in the uncorrupted scan, the global signal has a predominant contrast along the cortical gray matter, with anatomical features reminiscent of brain networks.
On the other hand, in the corrupted scan, the global signal clearly reflects the spatial contrast of the motion confound, whereas applying confound correction effectively recovers a predominant cortical contrast.This feature can thus offer an index of whether confounding sources or network activity are dominant in the data.k) Network spatial maps: Finally, the diagnosis shows the spatial network maps fitted using dual regression (or seed-based analysis) from the selected set of brain networks of interest (in this case the somatomotor and default mode networks).These fits provide insights into the quality of network analysis, and how they may affect downstream statistical analyses (see main text).Outside of motion spikes, we could derive similar conclusions for two alternative sources of confound related to slow global fluctuations (sup.figure 5).By relying on the two spatial features of signal variability, i.e.BOLD SD and CR SD , we find evidence that one of the confounds is associated to a ventricular origin and the other to a vascular origin, but the temporal features reveal that only the latter significantly influences network estimates from dual regression.In all of these confound cases, the temporal features reveal key mechanistic interactions between the presence of each confound and other aspects of the data, including network analysis.Subsequently, complementing the temporal information, CR SD can identify the specific source of confounds by inspecting its anatomical features within the brain.Finally, the BOLD SD and the global signal covariance can reveal whether signal variability is dominated by confound sources or network activity, and whether confounds persist after applying a correction.These observations demonstrate the complementarity of temporal and spatial features in the interpretation of signal sources, and together, these features allow the diagnosis of most major sources of confounds and can support their correction in rodent fMRI data.

SUPPLEMENTARY DISCUSSION
Comparison of chosen quality control approach with previous methodology by Grandjean and colleagues Grandjean and colleagues first introduced a quality control measure accounting for the presence and specificity of canonical rodent networks 6 .The method consists in measuring two functional connections: 1) a positive correlation between seeds in the right and left somatosensory cortices (S1), and 2) a null or negative correlation between the S1 and anterior cingulate seeds.The former assesses the expected functional connectivity arising from the somatomotor network, and is supported by the presence of known anatomical connectivity.The latter is expected as the ACC belongs to the so-called task negative network, which tends to be anti-correlated with the somatomotor network, and there are no direct axonal connections between those regions.Using this framework, the authors conducted the first multi-site comparison of data quality in mice 6 , and later in rats 7 , where they delineated four types of scan quality outcomes: specific, absent, spurious or unspecific connectivity.This approach is well grounded in biological priors and common notions of canonical network connectivity, while remaining interpretable and easily implemented.
However, while this technique provides important advantages in contrasting connectivity features across datasets, the anatomical constraints of the technique contrasted with the goals of the current manuscript.The RABIES toolkit aims to provide generalizable methods across a range of possible applications by allowing 1) application of the methods to any network/region of interest and 2) capture spurious features which may arise in any brain region.For this purpose, the measure of network specificity through Dice overlap was introduced.The technique can be applied by thresholding any network map, and thus generalizes to any ICA or seed-based connectivity results.Additionally, spurious features arising above threshold in any region will be detected.For instance, a spurious effect in ventral regions of the brain can be well captured (see example in sup.figure 9 for the group report), but would be missed by a priori seeds in the S1 and ACC.The Dice overlap measure is also complemented with additional measures, which are necessary to ensure a complete account of potential pitfalls during quality control (see main manuscript and sup.table 6).

The expectation of network variability specificity
As part of the quality control of network analysis, we put forward the recommendation that inter-subject variability in network connectivity should be predominant within the core anatomical delineation of the canonical network.This recommendation is grounded in observations presented in this manuscript, where datasets with relatively low evidence of confounds and clear network detectability tend to express those features of network variability at the group level (both for dual regression and seed-based connectivity analyses, and for the somatomotor and default mode networks).We argue that benchmarking network variability is important for accurately addressing certain data quality shortcoming that would be missed otherwise.We recognize, however, that the nature of inter-individual variability in network connectivity is not fully understood.In particular, it is not clear how to interpret cases where network variability is absent despite sufficient network detectability at the scan level and no evidence of spurious effects (sup.figure 12a).Such cases raise the question of whether the specific network variability should be necessarily expected.
Providing conclusive evidence on this issue would require a comprehensive assessment of the relationship between network variability as observed through BOLD fMRI and underlying neural metabolism.While we do not aim to reach such conclusions in this study, we emphasize several considerations.First, the specificity of network variability relates strongly to sample size (sup.figure 12b), and thus, absent variability could be indicative of shortcomings in statistical power.Sample sizes were relatively low for most datasets, which may explain the failure to detect such variability for a subset of the datasets.Second, cross-scan variability can transition from absent to specific following the optimization of confound correction (sup.figure 12c).We would thus recommend attempting improving confound correction before concluding that variability cannot be detected.Finally, we are providing here a generic measure of variability (standard deviation across scans) which can be applied on any set of scans, but alternatives could be considered in function of the experimental design.For instance, in the context of studying group differences, inspecting the spatial contrast of group differences (i.e. the group difference reproduces network features) can be a better indicator that the statistical analyses relate to the network of interest.Thus, we do not suggest that the built-in approach within RABIES is the only valid approach for assessing that statistical analyses relate to the network of interest.We do argue however that, provided the susceptibility of fMRI connectivity analyses to false findings, it is preferable to provide evidence that connectivity results relate to expected anatomical features from the network of interest.

Variable impact of confound correction across datasets and the customization of correction strategy on a per dataset basis
To evaluate the ability of confound correction to improve analysis quality outcomes and investigate whether specific strategies can provide generalizable improvements across datasets, the main strategies available within RABIES were all tested systematically across the 19 datasets considered in part 2 of the manuscript (sup.figure 13).More specifically, datasets were initially processed with the regression of 6 motion parameters and censoring with framewise displacement, then one additional correction was applied.The impact of the correction was evaluated across quality metrics part of the quality control guidelines (figure 3 &  4).Outcomes were also evaluated when removing initial corrections (using only censoring, or no correction applied) to inspect potential impacts of the baseline correction and benchmark the ability of quality control metrics in detecting problematic quality outcomes.
Overall, there was important variability in the impact of additional corrections across datasets.Although certain corrections predominantly yielded improvements in quality metrics (i.e.standard, WM/CSF and aCompCor), no single correction led to uniform increase in the number of scans passing network specificity and confound correlation thresholds across all datasets.This conclusion is based on the report shown in sup.figure 13 for the somatomotor network analysis using dual regression, as well as similar reports generated for alternative connectivity analyses shared in the supplementary files.The application of no correction (raw) or only framewise displacement censoring led to an almost uniform decrease in quality, thus justifying the recommendation of regressing 6 motion parameters and applying censoring as a baseline.
This observed variability is reminiscent of the fact that there is no consensus regarding an ideal correction strategy that should be applied across datasets and experiments 8,9 .This is why instead of providing an ideal strategy that should be blindly replicated, we opt for an approach grounded in an understanding of the data and associated research goals.In this case, we put forward guidelines describing how the data quality reports put forward in this manuscript can be leveraged to achieve standard connectivity analysis (sup.figure 147).We argue that such a framework is best for study comparison, over the replication of the exact same methodology, since the quality control measures relate more directly to analysis outcomes and goals.
It is important, however, to note that there can always be a risk of overfitting to meet a priori expectations (since of course quality control metrics have their own limitations, as listed in sup.table 6).For this reason, it remains important for researchers to be cognisant of the interpretation and limitations of those metrics.When comparing dataset, if there are minor and inconsequential dataset differences in the impact of confound correction on quality, it can become preferable to opt for methodological consistency.This was not the case in the current manuscript.

Supplementary table 1:
Core preprocessing modules within RABIES.

Structural inhomogeneity correction
The image is denoised with non-local mean denoising 10 (for EPIs, denoising was carried beforehand during 3D EPI generation) followed by iterative correction for intensity inhomogeneities 11 .Initial masking is achieved via intensity thresholding, giving an initial correction of the image, and a registration is then conducted to register a brain mask for a final round of correction.

Unbiased template generation
Generates a dataset-specific unbiased template from the input structural images.Through a set of iterations, images are registered to a consensus average generated from the overlap of all scans at the previous iteration.Registrations are progressively more flexible, executing 2 iterations each for a rigid, similarity, affine and finally non-linear template generation.The final iteration provides the individual transforms to align each scan to the unbiased template.This template generation process creates a robust target for the alignment of MRI sessions sharing the same acquisition properties, which will minimize registration inconsistencies between sessions, as opposed to the direct registration to an external template.

Atlas registration
A affine and non-linear registration is conducted to align the unbiased template with the structural atlas reference, hence providing the transforms to propagate associated anatomical masks and parcellations.

Generate 3D EPI
The 4D raw EPI file is used to generate a representative volumetric 3D EPI.This volume later becomes the target for motion realignment and the estimation of susceptibility distortions through registration to the structural image.Two iterations of motion realignment to an initial median of the volumes are conducted, then a trimmed mean is computed on the realignment volumes, ignoring 5% extreme, and this average becomes the reference image.The final image is then corrected using non-local means denoising 10 .

Functional inhomogeneity correction
Same as structural inhomogeneity correction, but conducted on the 3D EPI reference.

Susceptibility distortion correction
The input volumetric EPI image is registered non-linearly to an associated structural MRI image.The non-linear transform estimates the correction for EPI susceptibility distortions 12 .

Head motion estimation
This workflow estimates motion during fMRI acquisition.To do so, each EPI frame is registered to a volumetric target reference image with a rigid registration using ANTs' antsMotionCorr algorithm 13 .This results in the measurement of 3 Euler angles in radians and 3 translations in mm (from ITK's Euler3DTransform https://itk.org/Doxygen/html/classitk_1_1Euler3DTransform.html) at each time frame, which are then stored into an output CSV file.

Frame-wise resampling
This module carries out the resampling of the original EPI timeseries into preprocessed timeseries.This is accomplished by applying at each frame a combined transform which accounts for previously estimated motion correction and susceptibility distortion correction, together with the alignment to common space if the outputs are desired in common space.All transforms are concatenated into a single resampling operation to mitigate interpolation effects from repeated resampling.This workflow also carries the resampling of brain masks and labels from the reference atlas onto the preprocessed EPI timeseries.

Registration challenge Solution within RABIES
Parameter selection for the inhomogeneity correction and registration algorithms are dependent on brain size, and are thus inconsistent across rodent species.
We developed a scale-independent registration algorithm, wrapped around antsRegistration 21 , which automatically adapts parameters to species-specific physical dimensions.The provided commonspace target is used to compute brain dimensions, and automatically define the parameters for the inhomogeneity correction and registration operations (https://github.com/CoBrALab/minc-toolkit-extras/blob/master/ants_generate_iterations.py) Poor anatomical contrast led to several EPI masking failures during inhomogeneity correction.
We created an EPI reference atlas using mice from different sites with whole-brain coverage and varying anatomical contrast (methods section 3).Using this atlas as reference for workflows lacking structural images improved substantially the quality of masking during EPI inhomogeneity correction and of alignment from EPI-generated unbiased templates to the reference atlas.
We created an option to skip registration during inhomogeneity correction, and instead rely only on a Otsu thresholding 22 strategy for masking.The number of thresholds derived from the Otsu method can be manually selected to fit the intensity distribution of the brain for a given dataset.
Incomplete brain coverage or the blurring of brain boundaries results in misalignment during template generation and atlas registration.
Using a staged template construction with increasing orders of registration (rigid, similariy, affine and finally non-linear registration) (https://github.com/CoBrALab/optimized_antsMultivariateTemplateConstruction) helped ensure robust initial alignment of images before computing downstream nonlinear template.This was most crucial for datasets with incomplete brain coverage.
During inhomogeneity correction, a preliminary brain mask is computed (see sup. table 1).Following preliminary alignment for the unbiased template generation, these brain masks can be combined into a consensus brain mask (through majority vote in commonspace).This consensus mask can then support registration by computing the registration similarity metric only within the area defined by the mask, and ignoring other areas.This masking strategy allowed improving alignment during unbiased template generation and subsequent atlas registration.
A brain extraction option for registration was added, where at each registration step, brain masks are used to remove outside tissues and enhance brain edge contrast.As in 2. above, the preliminary brain mask from inhomogeneity correction was provided for EPI brain extraction.
For datasets lacking anatomical scans, the EPI-generated unbiased template frequently failed alignment with the DSURQE structural atlas.
In mice, when conducting preprocessing without anatomical scans, the EPI-generated unbiased template is registered to the EPI reference atlas mentioned above instead.

Supplementary table 4:
Preprocessing parameters across datasets.The EPI -Otsu threshold and the EPI masking during unbiased template generation are discussed in sup.table 3.

Category Name Definition
Nuisance regressor

motion parameters
Corresponds to 3 rotations (Euler angles in radians) and 3 translations (in mm) measured for head motion realignment at each timeframe.Prior to the regression, the motion regressors are also subjected to the same frame censoring, detrending and frequency filtering which were applied to the BOLD timeseries to avoid the re-introduction of previously corrected confounds, as recommended in 23 and 24 .

motion parameters
Corresponds to the 6 motion parameters together with their temporal derivatives (calculated by subtracting the parameter with its value at the previous timepoint), and then 12 additional parameters are obtained by taking the squared terms of both the original parameters and their derivatives (i.e.Friston 24 parameters 25 ).As with the 6 motion parameters, the 24 regressors are additionally subjected to censoring, detrending and frequency filtering if applied on BOLD.

WM/CSF/vascular /global signal
The mean signal is computed within the corresponding brain mask (white matter (WM), cerebrospinal fluid (CSF), vascular or whole-brain mask) from the partially cleaned timeseries (i.e. after frequency filtering, steps 1-4 in methods section 5).

aCompCor -Percentage
Principal component timecourses are derived from timeseries within the combined WM and CSF masks (aCompCor technique 26 ).From the timeseries within the WM/CSF masks , a principal component analysis (PCA) decomposition is conducted to derive with corresponding to a set of spatial principal components, and to their associated loadings across time.The set of first components explaining 50% of the variance are kept, and their loadings provide the set of aCompCor nuisance regressors.PCA is conducted on the partially cleaned timeseries (i.e. after frequency filtering, steps 1-4 in methods section 5).
aCompCor -5 components aCompCor as defined above, but the first 5 components are kept instead of a set explaining 50% of the variance.Temporal diagnostic features (sup.figure 4)

Power spectral density
The frequency power spectrum is computed for each voxel timecourse, and the average (and standard deviation) across voxels is displayed.

Carpet plot
All timeseries are displayed into a 2D matrix format, where the rows are all the brain voxels and the columns are the timepoints.

Translation and rotation parameters
Corresponds to 3 rotations (Euler angles in radians) and 3 translations (in mm) measured for head motion realignment at each timeframe.

Framewise displacement (FD)
For each timepoint, corresponds to the displacement (mean across the brain voxels) between the current and the next frame.For each brain voxel within the referential space for head realignment (i.e. the 3D EPI which was provided as reference for realignment (sup.table 1)) and for each timepoint, the inverse transform of the head motion parameters (from the corresponding timepoint) is applied to obtain the voxel position pre-motion correction.Framewise displacement can then be computed for each voxel by computing the Euclidean distance between the positions pre-motion correction for the current and next timepoints.Thus, the mean framewise displacement at timepoint is computed as using the 3D , and spatial coordinates in mm for timepoints and and for each voxel indices .Framewise displacement for the last frame (which has no future timepoint) is set to 0.

DVARS
Represents the estimation of temporal shifts in global signal at each timepoint, measured as the root-mean-square of the timeseries' temporal derivative where corresponds to the BOLD signal in brain voxel at timepoint .The first timepoint is set to 0 (has no previous timepoint).

Edge/WM/CSF mask
The L2-norm across voxels within a mask, at each timepoint.

CR var
The variance estimated by confound regression is computed for each timepoint.This is done by taking the L2-norm across voxels at each timepoints, where is the predicted confound timeseries (see methods section 5).
Represents the proportion of variance explained (and removed) by confound regression.This is obtained with at each timepoint, where and are the timeseries pre-and post-regression respectively, and calculates the variance, with as the mean.

Mean amplitude
A set of timecourse are averaged as , where is the timecourse .Timecourses can correspond to either of the following sets: 1) DR confounds: timecourses from the first stage of dual regression, using confound components (sup.figure 19); 2) DR networks: network timecourses from the first stage of dual regression; 3) SBC networks: network timecourses derived from the set of seeds provided.Spatial diagnostic features (sup. figure 4)

BOLD SD
The temporal standard deviation is computed for each voxel from the BOLD timeseries.

CR SD
The temporal standard deviation computed on each voxel from the predicted confound timeseries during confound regression (i.e. , methods section 5).

CR R 2
The proportion of variance explained by confound regression at each voxel.This is obtained with at each voxel, where and are the timeseries pre-and post-regression respectively, and is variance of , with as the mean.

Global signal covariance (GS cov )
The covariance between the global signal and the timeseries at each voxel is measured as , where , i.e. the mean across all brain voxels for a given timepoint.

DR network X
The linear coefficients resulting from the second regression with dual regression, corresponding to a network amplitude map (for the Xth network specified for analysis).The maps are thresholded to keep the top 4% of voxels with highest absolute value.

SBC network X
The voxelwise correlation coefficients (pearson's r) estimated with seed-based connectivity (for the Xth seed provided for analysis).The maps are thresholded to keep the top 4% of voxels with highest absolute value.

Network amplitude
The overall network amplitude is summarized by computing the L2-norm of the dual regression network amplitude map of linear coefficients.

Network specificity
The network map (seed-based or dual regression) and the corresponding canonical network map are thresholded to include the top 4% of voxels with highest connectivity, and the overlap of the thresholded area is computed using Dice overlap.

Dual regression confound correlation
The timecourse for a single network (from a seed or dual regression) is correlated with the timecourse from each confound component modelled through dual regression, then the absolute mean correlation is computed to obtain the average amplitude of confound Supplementary table 5: Definition of metrics computed within RABIES for nuisance regression, spatiotemporal diagnosis, or quality control metrics computed at the scan level.
correlations for this specific network analysis.
Total CR SD The global standard deviation of the 4D confound timeseries (i.e. , methods section 5).

Mean framewise displacement
The mean framewise displacement computed across time (only including frames after censoring applied for confound correction).

Temporal degrees of freedom (tDOF)
The remaining degrees of freedom post-confound correction are calculated by subtracting the original number of frame by 1) the number of censored frames, 2) the number of AROMA components removed, and 3) the number of nuisance regressors.

Quality control metric
Importance Limitation

Scan-level network specificity
Evaluates that the spatial features of the network are appropriately recovered in each scan.Accounts for network detectability issues, and spurious effects on network shape.
Does not detect spurious effects that only influence network amplitude (and not shape), and high scan-level network specificity does not imply specific network variability (sup.figure 9).

Scan-level temporal correlation with confounds
Controls for spurious connectivity, including network amplitude effects, at the scan level.In the case of strong confounding effects driving network amplitude with little impact on network shape, this measure can best capture spurious connectivity (sup.figure 6).
The selection of confound components is subjective (see classification in methods section 10), and requires expertise to conduct reliably.
If network activity is predominant in a given scan with little confounds, dual regression can overfit confound components to network activity, providing false confound correlations (sup.figure 10).This cautions against interpreting low correlation values, and restricts the application of this metric to the exclusion of highly corrupted scans.
Since lowpass filtering increases temporal autocorrelation 27 , defining an accurate inclusion threshold is more challenging when combined with this filtering technique.

Group-level specificity of network variability
Meeting the scan-level network specificity thresholds is insufficient to ensure that cross-subject variability is specific to the network, as inter-scan variability can differ significantly from the core network features observed in individual scans (sup.figure 9).
Divergences in network detectability between subjects as well as spurious effects on network amplitude with minimal impact on shape can both drive cross-scan variability in a network-specific manner.Thus, to ensure that network variability reflects biological variability of interest instead of divergences in data quality, it must be complemented with other metrics accounting for those effects.

Group-level confound correlation
Detects residual relationships between inter-scan variability in connectivity and confounds which can remain after accounting for scan-level thresholds (sup.figure 11).
Group-level correlations do not necessarily capture large confound effects observable in individual scans (sup.figure 6).
What constitutes a 'concerning' correlation size should be considered in the context of the study, depending on the effect size of interest (i.e. is the effect size of interest much higher or similar to the effect size of confounds?).It is difficult to establish a standardized correlation threshold, as effect sizes can be inflated in low sample size 28 .
The CR SD measure of confound will depend on the input parameters for computing confound regression, and the variance explained by confound regression is influenced by baseline random noise (e.g.scanner thermal noise), and thus may be influenced by factors which are not strickly motion or physiological confounds.

Supplementary table 6:
Complementarity and limitations of the selected set of quality control metrics.

Spatial features
Uncorrupted scan: no confound correction The network variability map is not specific to the network.
This is addressed before the group correlation, as the group correlation may be unreliable before network specificity is ensured (sup.figure 4).

8) Group-level confound correlation
There are significant group-level confound correlations.

table 2 :
23rodent fMRI datasets and results from the quality control of preprocessing.Also, the predominant category for scan data quality markers, as described in figure2, is listed for each dataset included for part 2 of the results.This was determined through visual inspection of the data quality markers in individual scans for each dataset.

table 3 :
Innovations required to offer generalizable registration for rodent fMRI.

Temporal features Corrupted scan: no confound correction Corrupted scan: with confound correction
*** spatial features of motion (i.e.Anatomical edges) *** Cortical global signal contrast *** Standardize variance: this option was only considered if there were signatures of confounds in the BOLD SD map.It is also considered after regression strategies, as this correction does not remove the confound, but only reduce their potential contribution (i.e.reduce the variance of most corrupted voxels).Scan-level thresholds are addressed before the group statistics below to gather as many scans as possible.Distribution report (sup.figure 9) 1. aCompCor 2. Standardize variance 7) Group-level network variability